ABSTRACT

The CP-violating version of the Minimal Supersymmetric Standard Model (MSSM) is
an example of a model where experimental data do not preclude the presence of light
Higgs bosons in the range around 10 - 110 GeV. Such light Higgs bosons, decaying
almost wholly to $b\bar{b}$ pairs, may be copiously produced at the LHC but would remain
inaccessible to conventional Higgs searches because of intractable QCD backgrounds.
We demonstrate that a significant number of these light Higgs bosons would be boosted
strongly enough for the pair of daughter b-jets to appear as a single 'fat' jet with
substructure. Tagging such jets could extend the discovery potential at the LHC into
the hitherto-inaccessible region for light Higgs bosons.
1 Introduction: Survival of Light Higgs Bosons



After four decades of relative stagnation, the physics of elementary particles has again
reached a stage when empirical discoveries could lead the day. Theoretical structures built
with much care and ingenuity over the previous decades now lie in serious danger of collapsing
under the onslaught of data from the Large Hadron Collider (LHC). The most important
of these theoretical structures under test is, of course, the Standard Model (SM) of particle
physics. The first - and, it turns out, most stringent - test of the SM lies in the discovery of
the predicted [1-4] Higgs boson $H^0$. It is now universally known that a discovery has been
made [5] of a boson of mass around 125 GeV which seems to resemble the Higgs boson of
the SM [6], but further precise measurements of its couplings are required before it can be
established as the SM Higgs boson in a manner convincing to all. The situation is expected
to become much clearer at the end of the current year, after more data are collected and
analysed.

Whatever be the outcome of these Higgs boson measurements, the issue of light Higgs bosons
(i.e. masses < 100 GeV) will still remain wide open, so far as the LHC is concerned. This
is because signals for a light Higgs boson decaying principally to b-jets will be completely
swamped by the QCD background for production of $b\bar{b}$ pairs (and other dijets) - which is
to say that such light Higgs bosons will be, for all practical purposes, invisible at the LHC.
At a first reading, the above argument might seem to be purely academic, in view of the
definite bound of $M_H \geq 114.4$ GeV reported by the LEP-2 electroweak working group [7].
This is certainly true of the SM Higgs boson. However, we must remember that this bound
arises from the negative results of searches for the Higgsstrahlung process, viz.,

$$ e^+ + e^- \to Z^* \to Z^0 + H^0 \to (\ell^+\ell^-) + (b\bar{b}) \qquad (1) $$

where $\ell = e, \mu$ or $\tau$, and is, therefore, strongly dependent on the $HZZ$ coupling. This
coupling is, of course, fixed in the SM, and therefore the lower bound of 114.4 GeV is also
fixed in the SM. On the other hand, any model with new physics beyond the SM (BSM)
which can accommodate a significantly smaller $HZZ$ coupling will immediately evade this
bound. Thus, we can even now ask whether the observed new particle is the
only Higgs boson, or whether it is one of a pack of scalars in some BSM physics model, where the
others are invisible at the LHC because they are light.

Whenever new physics beyond the SM is indicated, the usual model of choice is the Minimal
Supersymmetric Standard Model (MSSM) or one of its many variations. Even in the MSSM
with all real and CP-conserving parameters, the lower bound on the lightest Higgs boson
mass (93 GeV for $\tan\beta > 6$) is different from that of the SM Higgs mass bound [8]. Not
surprisingly, even this bound can be relaxed considerably in the presence of CP violation
in the Higgs sector [9], where the lightest Higgs boson $h_1^0$ can have a substantial CP-odd
component - which immediately implies a highly suppressed $h_1^0 ZZ$ coupling (see below).

Though an overwhelming majority of studies of the MSSM have assumed that the supersymmetric
parameters are real and CP-conserving, it may be recalled that of the 105 new
undetermined parameters in the MSSM, only 62 are CP-conserving, while as many
as 43 are CP-violating. Many of these CP-violating phases cannot be rotated away by simple
re-definition of fields and this is known to lead to new sources of CP violation which
may come in useful to explain the observed level of baryon asymmetry in the Universe [10].
However, these additional phases, especially those involving the first two generations, lead to
large electric dipole moments (EDMs) of the electron and the neutron, as well as of mercury (Hg)
atoms, which come in conflict with the experimentally measured upper bounds on these EDMs [11-13].
The new CP phases may, therefore, be expected to be severely constrained. Fortunately,
it turns out that such constraints are strongly dependent on the CP-conserving model parameters
- specifically the nature of the mass spectrum involved in the calculation of those
EDMs - and it has been explicitly shown that substantial cancellations among different EDM
diagrams end up in allowing some combinations of the CP-violating phases to be large [14].
For example, the EDM constraints require the phase $\phi_\mu$ of the higgsino mass parameter
$\mu = |\mu| e^{i\phi_\mu}$ to be generally constrained to $\phi_\mu \lesssim 10^{-2}$ (unless we set the masses of all the
sfermions very high). But if the sfermions of the first two generations are of the order of a
few TeV [15] and we do not assume universality of the trilinear scalar couplings $A_f$ [16, 17],
then $\phi_\mu$ can be considerably larger even if we keep the sfermions of the third generation light
enough for easy detection at the LHC. The presence of CP-violating phases can substantially
modify Higgs boson and superparticle production at colliders as well as decay modes and
this has been the subject of several investigations [18] in the context of collider signals.

Though the new CP-violating phases appear only in the soft supersymmetry-breaking parameters
$\mu$, $A_f$ and the three gaugino masses $M_1$, $M_2$ and $M_3$, some of them can induce CP
violation at the one-loop level in the Higgs potential even if the tree-level Higgs potential
is CP-conserving [19-23]. The quadratic terms for neutral states in this one-loop-corrected
Higgs potential can be written in terms of a 3 x 3 mass-squared matrix $\mathcal{M}^2_{ab}$, where non-zero
off-diagonal terms ($a \neq b$) involve mixing between scalar and pseudoscalar states. This is
unlike the CP-conserving MSSM, where the scalar states ($h^0$, $H^0$) and the pseudoscalar state
$A^0$ do not mix. After diagonalisation, the physical neutral Higgs states $h_1^0$, $h_2^0$ and $h_3^0$ (in
ascending order of mass) become admixtures of CP-even and CP-odd states. Since the pseudoscalar
states do not couple to $ZZ$ pairs, it is obvious that the $h_i^0 ZZ$ couplings ($i = 1, 2, 3$)
arise only from the CP-even components of the $h_i^0$ and therefore will be suppressed by the
corresponding mixing angles. There exists enough freedom in the choice of parameters in this
CP-violating MSSM for at least some of these states to be light, which would make
them invisible to conventional searches at the LHC. In fact, a set of benchmark points has
been defined to showcase the maximal effect of CP violation in the MSSM Higgs sector, and
this set goes by the name 'CPX scenario' [23].

The LEP collaborations have searched for the processes

$$
\begin{aligned}
e^+ e^- \to Z^* &\to Z^0 + h_i^0 \to \ell^+\ell^- + b\bar{b} \\
&\to h_i^0 + h_j^0 \to 4b \\
&\to h_1^0 + h_2^0 \to 3h_1^0 \to 6b \\
&\to Z^0 + h_2^0 \to Z^0 + 2h_1^0 \to (\ell^+\ell^-) + 4b
\end{aligned}
\qquad (i = 1, 2, 3)
$$

in the CP-violating MSSM Higgs sector based on the CPX scenario [25]. For certain choices
of CP-violating parameters within the CPX scenario, the LEP-2 data allow for a much lighter
Higgs boson with a mass $M(h_1) \lesssim 40$-$50$ GeV [24, 25, 27] because, as expected, there are
very substantial reductions in the $h_1^0 ZZ$ coupling [23]. It turns out that for the selfsame sets
of parameters the couplings $h_1^0 WW$, $h_1^0 ZZ$ and $h_1^0 t\bar{t}$ all get reduced simultaneously, as a result
of which none of the canonical search channels for $h_1^0$ at the Tevatron and LHC are expected
to be viable [26, 28-30]. This implies that there is a 'blind spot' or 'hole' in the parameter
space which is permitted by all experimental data to date. Different search strategies in the
future runs of the LHC have been proposed to close this 'blind spot' [28, 31-33]. These have
varying levels of success, depending on the specific choice of parameters, but none can be
said to be the definitive search strategy for light Higgs bosons belonging to this inaccessible
region.

In this work, we explore a method of searching for these light Higgs bosons which is based
essentially on kinematics, and hence is not overly sensitive to the parameter choices of
the theory. Our strategy is to apply to the case of the light Higgs bosons of the CP-violating
MSSM some of the techniques developed recently for tagging a heavy boosted
particle decaying to a single fat jet with substructure [33-39]. Since the $h_1^0$, $h_2^0$ and $h_3^0$ can
all decay to a pair of highly-boosted b-jets, one could ask if this technique can suitably
identify the Higgs boson states over and above the enormous QCD background at the LHC.
Daunting as the task may seem at first, we demonstrate that this technique, in fact, works
quite well, and can be considered as an important probe of the CP-violating Higgs sector.
The basic technique can, in fact, be used in the wider context of light scalar states, such as
those considered recently in Ref. [40].



2 The Higgs Sector of the CP-violating MSSM

As already mentioned in the introductory section, the non-vanishing phases of $\mu$ and/or
the trilinear scalar couplings $A_t$ and $A_b$ can induce explicit CP violation in the Higgs sector.
Since the only Yukawa interactions of the Higgs bosons which can have any significant effects
are those to top and bottom squarks, this feature is reflected in the trilinear couplings as
well, and thus the only relevant CP phases are $\phi_\mu = \mathrm{Arg}[\mu]$, $\phi_t = \mathrm{Arg}[A_t]$ and $\phi_b = \mathrm{Arg}[A_b]$.
Given these, the scalar potential, even though invariant under CP-transformation at tree
level, receives CP-violating contributions through one-loop corrections.

We briefly review the formalism required to include CP-violating effects in the MSSM. As is 
usual, we write the two scalar doublets as gauge eigenstates in the form 

^' ] t.^f^' + '^M (2) 



Once we allow the parameters in the scalar potential to have CP-violating phases, the mass 
matrix for the neutral scalars assumes the general form 
This 4x4 mass matrix is partitioned into 2x2 blocks, with independent $\mathcal{M}^2_S$, $\mathcal{M}^2_P$ and
$\mathcal{M}^2_{SP}$ - of which the last is absent in the CP-conserving MSSM but is generated in the CP-violating
MSSM through the one-loop corrections mentioned above [19-21]. The magnitude
of different contributions to the terms in the 2x2 matrix $\mathcal{M}^2_{SP}$ may be estimated as [20]:



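$$
\mathcal{M}^2_{SP} \simeq \mathcal{O}\left( \frac{M_t^4\, |\mu|\, |A_t|}{v^2\, 32\pi^2 M^2_{\rm SUSY}} \right) \sin\Phi_{\rm CP} \times \left[ 6,\ \frac{|A_t|^2}{M^2_{\rm SUSY}},\ \frac{|\mu|^2}{\tan\beta\, M^2_{\rm SUSY}},\ \frac{\sin 2\Phi_{\rm CP}\, |A_t|\, |\mu|}{\sin\Phi_{\rm CP}\, M^2_{\rm SUSY}} \right] \qquad (4)
$$

where $\Phi_{\rm CP} = \mathrm{Arg}(A_t \mu)$, $v = 246$ GeV, and the mass scale $M_{\rm SUSY}$ is defined by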
Rough estimates of the degree of CP violation in the Higgs sector can be formed by taking
the dominant one(s) of these contributions. For example, from the above expression it is
clear that a sizeable scalar-pseudoscalar mixing is possible for a large CP-violating phase
$\Phi_{\rm CP}$ and $|A_t| > M_{\rm SUSY}$.
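As a rough numerical illustration of this estimate, the sketch below evaluates the common prefactor and the four bracketed entries, reading Eqn. (4) as prefactor x [6, |A_t|^2/M_SUSY^2, |mu|^2/(tan(beta) M_SUSY^2), sin(2 Phi_CP)|A_t||mu|/(sin(Phi_CP) M_SUSY^2)]. The function name `sp_mixing_estimate` is hypothetical, and the inputs (M_SUSY = 500 GeV, |mu| = 4 M_SUSY, |A_t| = 2 M_SUSY, Phi_CP = pi/2, tan(beta) = 15) are assumed CPX-like values chosen only for illustration.

```python
import math

def sp_mixing_estimate(m_t, mu, a_t, m_susy, tan_beta, phi_cp, v=246.0):
    """Order-of-magnitude pieces of the scalar-pseudoscalar mixing term.

    Returns (prefactor, bracket): the common prefactor in GeV^2, already
    multiplied by sin(phi_cp), and the four dimensionless entries that
    multiply it. All masses are in GeV; phi_cp must not be a multiple of pi.
    """
    s = math.sin(phi_cp)
    prefactor = (m_t**4 * abs(mu) * abs(a_t)
                 / (v**2 * 32.0 * math.pi**2 * m_susy**2)) * s
    bracket = (
        6.0,
        abs(a_t)**2 / m_susy**2,
        abs(mu)**2 / (tan_beta * m_susy**2),
        math.sin(2.0 * phi_cp) * abs(a_t) * abs(mu) / (s * m_susy**2),
    )
    return prefactor, bracket

# Assumed CPX-like inputs (illustrative only).
M_SUSY = 500.0
prefactor, bracket = sp_mixing_estimate(
    m_t=173.13, mu=4 * M_SUSY, a_t=2 * M_SUSY,
    m_susy=M_SUSY, tan_beta=15.0, phi_cp=math.pi / 2)
print(f"prefactor ~ {prefactor:.0f} GeV^2, bracket = {bracket}")
```

With these inputs the last entry vanishes (sin 2 Phi_CP = 0) and the |A_t|^2 term is comparable to the constant term, which is the sense in which large |A_t|/M_SUSY and a large phase favour sizeable mixing.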

The diagonalisation of this 4x4 mass-squared matrix can be carried out in two stages, of
which the first is to simply diagonalise the sub-matrix $\mathcal{M}^2_P$ and replace the two pseudoscalar
gauge eigenstates with the more familiar $G^0$, $A^0$. It turns out that after this is done, the first row and
column of the 4x4 mass matrix are left only with contributions from tadpole diagrams,
which would be removed in any renormalisation programme. It follows that, apart from a
massless Goldstone boson which does not mix further with the other neutral states and is
eventually absorbed by the massive $Z^0$ boson, we obtain a 3 x 3 Higgs mass-squared matrix
$\mathcal{M}^2$, with a mass term
(6)



where $A^0$ is the appropriate eigenstate of $\mathcal{M}^2_P$. Diagonalising this 3x3 symmetric matrix
$\mathcal{M}^2$ by an orthogonal matrix $O$,

we see that the physical mass eigenstates $h_1^0$, $h_2^0$ and $h_3^0$ (in ascending order of mass) are
mixtures of the CP-odd $A^0$ and the CP-even $\phi_1$ and $\phi_2$. These, therefore, are states of
indefinite CP. The usual sum rules for neutral Higgs boson masses (i.e. eigenvalues of $\mathcal{M}^2$)
become much more complicated than in the CP-conserving case. Moreover, as $A^0$ is no longer
a physical state, the charged Higgs boson mass $M_{H^\pm}$ is a more appropriate parameter for
description of the MSSM Higgs sector in place of the $M_A$ used in the CP-conserving model.



The couplings of these new states $h_i^0$ ($i = 1, 2, 3$) to the weak gauge bosons $W^\pm$ and $Z^0$ will
obviously be different from those in the CP-conserving MSSM. We can write them as

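$$
\begin{aligned}
\mathcal{L}_{hVV} &= g M_W \left( W^+_\mu W^{-\mu} + \frac{1}{2\cos^2\theta_W} Z_\mu Z^\mu \right) \sum_{i=1}^{3} g_{h_i VV}\, h_i^0 \\
\mathcal{L}_{hhZ} &= \frac{g}{2\cos\theta_W} \sum_{j>i=1}^{3} g_{h_i h_j Z}\, Z^\mu \left( h_i^0 \overleftrightarrow{\partial}_\mu h_j^0 \right) \\
\mathcal{L}_{hH^\pm W^\mp} &= \frac{g}{2} \sum_{i=1}^{3} g_{h_i H^+ W^-}\, W^{-\mu} \left( h_i^0\, i\overleftrightarrow{\partial}_\mu H^+ \right) + \text{h.c.}
\end{aligned}
\qquad (8)
$$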

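where

$$
\begin{aligned}
g_{h_i VV} &= O_{1i} \cos\beta + O_{2i} \sin\beta \\
g_{h_i h_j Z} &= O_{3i} \left( \cos\beta\, O_{2j} - \sin\beta\, O_{1j} \right) - (i \leftrightarrow j) \\
g_{h_i H^+ W^-} &= O_{2i} \cos\beta - O_{1i} \sin\beta + i\, O_{3i}
\end{aligned}
\qquad (9)
$$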

These couplings obey the following sum rules:

$$
\sum_{i=1}^{3} g^2_{h_i VV} = 1, \qquad g^2_{h_i VV} + |g_{h_i H^+ W^-}|^2 = 1, \qquad g_{h_k VV} = \varepsilon_{ijk}\, g_{h_i h_j Z} \qquad (10)
$$

from which one can see that if two of the $g_{h_i ZZ}$ are known, then the whole set of couplings
of the neutral Higgs bosons to the gauge bosons is determined. It is interesting to see from
Eqn. (10) that in the case of large scalar-pseudoscalar mixing the suppressed $h_1^0 VV$ coupling
means an enhanced $h_1^0 H^\pm W^\mp$ coupling. This has been exploited to develop search strategies
in Ref.
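The sum rules can be checked numerically for any orthogonal mixing matrix. The sketch below is only an illustration: it builds an arbitrary proper rotation (the helper `rotation` and the angles are hypothetical, not values from the model), evaluates the couplings as defined in Eqn. (9), and tests the three relations of Eqn. (10). The third relation is checked in magnitude only, since its overall sign depends on the convention adopted for det O.

```python
import math

def rotation(a, b, c):
    """Proper orthogonal 3x3 matrix: product of rotations about z, y and x."""
    def mul(m, n):
        return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    rz = [[math.cos(a), -math.sin(a), 0.0],
          [math.sin(a), math.cos(a), 0.0],
          [0.0, 0.0, 1.0]]
    ry = [[math.cos(b), 0.0, math.sin(b)],
          [0.0, 1.0, 0.0],
          [-math.sin(b), 0.0, math.cos(b)]]
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(c), -math.sin(c)],
          [0.0, math.sin(c), math.cos(c)]]
    return mul(rz, mul(ry, rx))

def couplings(O, beta):
    """Couplings of Eqn. (9); O[a][i] stands for O_{(a+1)(i+1)}."""
    cb, sb = math.cos(beta), math.sin(beta)
    g_vv = [O[0][i] * cb + O[1][i] * sb for i in range(3)]
    g_hw = [complex(O[1][i] * cb - O[0][i] * sb, O[2][i]) for i in range(3)]
    g_hhz = [[O[2][i] * (cb * O[1][j] - sb * O[0][j])
              - O[2][j] * (cb * O[1][i] - sb * O[0][i])
              for j in range(3)] for i in range(3)]
    return g_vv, g_hw, g_hhz

O = rotation(0.3, 1.1, -0.7)              # arbitrary illustrative angles
g_vv, g_hw, g_hhz = couplings(O, math.atan(15.0))

sum_vv = sum(g * g for g in g_vv)                              # first rule
unit = [g_vv[i] ** 2 + abs(g_hw[i]) ** 2 for i in range(3)]    # second rule
cyclic = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
third = [abs(abs(g_vv[k]) - abs(g_hhz[i][j])) for i, j, k in cyclic]
print(sum_vv, unit, third)
```

The second rule makes the trade-off explicit: a state with a small coupling to VV necessarily has a coupling to H+W- close to unity in magnitude.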

Having described the formalism in which one can study the mixed-CP Higgs bosons of
the model, we need to choose the parameters of interest. We use the package CPsuperH
(version 2.2) [41] to generate the entire particle spectrum of the CP-violating MSSM
for every given set of input parameters. It has already been mentioned that the quantity
$\sin\Phi_{\rm CP}/M^2_{\rm SUSY}$ needs to be large to support significant CP-mixing in the Higgs sector.
As mentioned in the introductory section, the benchmark scenario dubbed the 'CPX scenario'
[23] nicely showcases this CP violation since, among other things, it makes the $h_1^0 ZZ$
coupling small enough to evade the LEP bounds [24, 25, 27]. This CPX scenario may be
summarised as the specific parameter choices

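$$
M_{\tilde Q} = M_{\tilde t} = M_{\tilde b} = \frac{|\mu|}{4} = \frac{|A_t|}{2} = \frac{|A_b|}{2} = \frac{|A_\tau|}{2} = M_{\rm SUSY}
$$
$$
\mathrm{Arg}[A_t] = \mathrm{Arg}[A_b] = \mathrm{Arg}[A_\tau] \qquad (11)
$$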

It is important to note that this is just a benchmark scenario which allows us to find parameter
choices which lie in the 'blind spot' or inaccessible region of previous collider experiments,
and does not exhaustively cover the whole of the inaccessible parameter space. However, it 
suffices to provide a framework for predicting the existence of light, hitherto-invisible Higgs 
bosons, which is the subject of this work. In fact, instead of exploring the entire CPX 
scenario and its manifold variations, we find that it is sufficient to choose four benchmark 
points BP-1, BP-2, BP-3 and BP-4, and concentrate our exploratory studies on these points. 
We use the well-known code HiggsBounds (version 2.1.1) [42] to ensure that these four
benchmark points are allowed by the LEP-2 and Tevatron Higgs searches. The parameter
choices common to all four benchmark points (BP) are as follows:

• $M_{\rm SUSY} = 500$ GeV and $\Phi_{\rm CP} = \mathrm{Arg}[A_t] = \mathrm{Arg}[M_{\tilde g}] = \pi/2$. The related parameters are
then fixed using Eqn. (11).

• The remaining phases are fixed as $\mathrm{Arg}[M_1] = \mathrm{Arg}[M_2] = 0$.

• The gluino mass is fixed to $M_{\tilde g} = 1.2$ TeV, as are the masses of squarks of the first two
generations; the squarks of the third generation are assumed to have masses of 500 GeV.

• The top quark mass is taken to be 173.13 GeV. Of course, this is not a free
parameter, but its exact value determines the top Yukawa coupling and hence controls
the running of SUSY masses and couplings between the SUSY-breaking scale and the
electroweak scale in a critical manner.

The parameters which differ from one benchmark point to another are given in Table 1,
where a vertical line separates these input parameters from some masses calculated by the
CPsuperH package.
BP   | $M_1$ | $M_2$ | $\tan\beta$ | $M_{H^\pm}$ || $M(h_1^0)$ | $M(h_2^0)$ | $M(h_3^0)$ | $M(\tilde\chi_1^0)$ | $M(\tilde\chi_1^\pm)$
BP-1 |       | 200   |             | 125.7       || 49.4       |            | 130.4      | 99.7                | 198.6
BP-2 | 100   | 200   | 15          | 128.0       || 68.3       | 111.6      | 125.1      | 99.8                | 199.2
BP-3 | 200   | 400   | 15          | 130.0       || 72.1       | 113.2      | 125.8      | 199.8               | 398.9
BP-4 | 100   | 200   | 8           | 140.0       || 83.0       | 112.7      | 135.6      | 99.7                | 198.9

Table 1: Our choice of benchmark points in the CPX scenario. In addition to the parameter choices, we
present the masses of the three neutral Higgs states $h_1^0$, $h_2^0$ and $h_3^0$, as well as the lightest neutralino $\tilde\chi_1^0$ and
the lightest chargino $\tilde\chi_1^\pm$. The next-to-lightest neutralino is almost degenerate with the $\tilde\chi_1^\pm$. All masses
are in units of GeV.

This table illustrates very well some of the points made in the previous discussion. The benchmark
points have the common feature that the lightest Higgs boson is much lighter than the LEP bound
and will therefore be highly boosted at the LHC. It is interesting that for BP-2 and BP-3,
the third of the neutral Higgs bosons lies precisely in the range where the new boson
has been found [5], but it would be premature to read too much into this until we have a
precise measurement of the couplings. Similarly we do not concern ourselves overly with the
relatively light charged Higgs bosons, although they do make substantial contributions [45] to
the decay width for the process $B_s \to \mu^+\mu^-$. In fact, strictly speaking, if we take the current
upper bounds from the LHCb on this branching ratio [56], none of our benchmark points
would be tenable.^1 For the moment, however, our interest focuses on the light Higgs bosons
$h_1^0$ (and sometimes the $h_2^0$), where we shall presently show that tagging of their boosted decay
products can help their detection at the LHC, and thereby shed some light on the parameter
space which has hitherto been a 'blind spot' for collider experiments.



3 Boosted Higgs Bosons and Jet Substructure 

In this section we discuss the technique used to identify light Higgs bosons decaying into
a pair of b-jets over and above the enormous QCD background. This is a technique based
essentially on kinematics and hence is not specific to any model. The CP-violating MSSM 
described in the previous section acts here only as a phenomenologically-viable framework 
which can support the existence of light Higgs bosons. 

The study of jet substructure began some years ago [33-39] with the realisation that in
searches for new physics, the kinematic configuration of final states involving hadronic jets 
at the LHC can be very different from those studied earlier at colliders operating close 
to the electroweak scale, such as the LEP (91 - 205 GeV), the Tevatron (around 300 - 
350 GeV) and the HERA (200 - 300 GeV). At these machines, the major searches were for 
new particles of mass between some tens of GeV to a few hundred GeV, i.e. the masses 
were comparable to the machine energy. When particles of such mass are produced, they do 
not carry much momentum and hence are only mildly boosted. At the LHC, however, the 
available centre-of-mass energy is around 1-2 TeV, but the particles being sought are the 
same as before. It follows, therefore, that, if produced, these particles will be very highly 
boosted. This realisation has led to a new paradigm for studies of new physics in hadronic 
final states at the LHC. In fact, if a new particle of mass in the range of a few hundred GeV 
is discovered at the LHC then further studies of that particle would almost always include a 
strong boost. It may be kept in mind, however, that if new physics continues to elude LHC 
searches, the mass limits on these particles will eventually increase and become comparable 
to the available machine energy — thereby restoring the situation at earlier colliders and 
reinstating the techniques invented for those specific studies. The present study (like all
similar studies) is, therefore, currently relevant because we are in the early stages of the
LHC run.

^1 However, this can be circumvented in various ways, either by going beyond the minimal flavour violation
paradigm, or by activating some of the supersymmetric phases set to zero in our choice of benchmark points,
without prejudice to the existence of light neutral Higgs states. This issue will be taken up in a future
work [47].
The main feature of the decay products of a boosted particle decaying into multiple hadronic
jets is that the final states remain highly collimated, appearing as a single fat jet. Thus,
at the LHC, there will be W/Z-jets [3ll[35lll8], t-jets [37l[3Hlll3] and H-jets plll9] in the
Standard Model, and in new physics models there will be objects like charged Higgs jets [50],
neutralino jets [51], etc., depending on the model chosen. The invariant mass of such a jet,
constructed by adding the momenta of all the hadronic clusters, will peak at the mass of the
parent particle, i.e. at $M_W$, $M_Z$, $m_t$, and so on. However, it is not enough to identify fat jets
with a certain range of invariant mass, since there will be a substantial QCD background to
these - enough to mask the small numbers due to production of these heavier particles. It is
necessary, therefore, to tag the fat jets further by scanning them at higher angular resolutions
for substructures which would betray their origin from the decay of heavy particles, rather
than a series of gluon radiations and gluon splittings, which characterises the typical QCD
jet. Criteria imposed on the substructure of fat jets have been used with success to tag W,
Z and t jets, and here we apply a set of such criteria designed to identify fat jets arising
from light Higgs bosons decaying to a pair of b quarks.
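To get a feel for how much boost is needed before the two b-quarks fall inside one fat jet, the sketch below uses the rule of thumb quoted later in this section, an opening angle of roughly 2M/p_T, together with the jet radius R_0 = 0.6 adopted below. The function name is hypothetical, and the lightest-Higgs masses are the values read from Table 1.

```python
def min_pt_for_fat_jet(mass, r0):
    """Smallest p_T (GeV) for which the opening angle ~ 2*mass/p_T of a
    two-body decay of a particle of the given mass (GeV) is within r0."""
    return 2.0 * mass / r0

# Lightest-Higgs masses for the four benchmark points (GeV), as read from Table 1.
benchmarks = [("BP-1", 49.4), ("BP-2", 68.3), ("BP-3", 72.1), ("BP-4", 83.0)]
for name, m_h1 in benchmarks:
    pt_min = min_pt_for_fat_jet(m_h1, 0.6)
    print(f"{name}: M(h1) = {m_h1:5.1f} GeV -> p_T >~ {pt_min:6.1f} GeV")
```

This is only a kinematic estimate; it says nothing about how often such transverse momenta are actually reached.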

The exact method used by us to tag Higgs boson jets closely follows that used in Ref. [36].
We now describe this technique, which has three stages. In our numerical simulation the
basic processes are calculated using the well-known Monte Carlo generator Pythia [52],
and the jets are identified using the add-on package FastJet [53]. Thus, we start with a
bunch of hadronic final states, of which some are identifiable with known hadrons, but others
are just clumps of quarks and gluons, bound together for some short lifetimes. This is the
theoretical equivalent of what the experimentalist would see as a bunch of hadronic 'clusters'
recorded in the hadron calorimeter (HCAL). As the first stage in our analysis, we identify a
jet from these putative 'clusters' by applying the Cambridge/Aachen (C/A) algorithm [54],
which operates as follows.

• The angular distance $\Delta R_{ij}$ between all pairs $(i, j)$ of 'clusters' is given by

$$ \Delta R_{ij} = \sqrt{(y_i - y_j)^2 + (\varphi_i - \varphi_j)^2} \qquad (12) $$

where $y_i = \frac{1}{2} \ln \frac{E_i + p_{Li}}{E_i - p_{Li}}$ is the rapidity and $\varphi_i$ is the azimuthal angle of the
$i$-th cluster. This quantity, which is invariant under longitudinal boosts, is tabulated
for all pairs $(i, j)$ and the pair with the smallest value of $\Delta R_{ij}$ is merged into a single
'cluster', i.e. the momentum four-vectors are added into a single momentum four-vector.
We thus have a new configuration with one less 'cluster' than before.

• The above exercise is then repeated with the new configuration, as a result of which the
number of 'clusters' is again reduced by unity. This is iterated until there remains only
a single four-vector which has been built up by this merging process, with the nearest
hadronic 'cluster' lying at an angular distance $\Delta R > R_0$, where $R_0$ is a predetermined
angular width for a 'fat' jet. The synthetic four-vector $(\vec{P}, M)$ built up by this process is
identified as that of a jet, and the three-momentum $\vec{P}$ of this jet is identified as the jet
axis. The invariant mass of this jet is simply constructed by taking $M = \sqrt{P_0^2 - \vec{P}^2}$.
For Higgs boson masses in the range of a few tens of GeV the entire Higgs jet algorithm
works best if we set $R_0 = 0.6$ and this is what is done in the rest of the discussion.

• One then iterates the above procedure for all hadronic clusters lying outside the cone
$\Delta R = R_0$ centred around this jet axis, i.e. for this part of the study, the four-vector
already identified as a jet will be excluded. It is likely that the remaining clusters will
again combine to form a different jet under this algorithm, or more than one jet, as
the iterative process continues.

• Some of the jets thus synthesised may be soft, with $p_T$ lying below 20 GeV. These are
usually discounted and only hard jets ($p_T > 20$ GeV) thrown up by the C/A algorithm
are involved in the remaining part of the analysis.
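The clustering steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FastJet implementation used in the analysis: the `Cluster` class and the helper names are hypothetical, the closest pair is found by brute force, and merging stops as soon as every remaining pair is separated by more than R_0, which is one common way of phrasing the inclusive C/A procedure. Each merged object keeps a reference to its two parents so that the merging history can be undone later.

```python
import math

class Cluster:
    """Four-vector plus the two parents it was merged from (None for inputs)."""
    def __init__(self, e, px, py, pz, parents=None):
        self.e, self.px, self.py, self.pz = e, px, py, pz
        self.parents = parents

    @property
    def pt(self):
        return math.hypot(self.px, self.py)

    @property
    def rapidity(self):
        return 0.5 * math.log((self.e + self.pz) / (self.e - self.pz))

    @property
    def phi(self):
        return math.atan2(self.py, self.px)

    @property
    def mass(self):
        m2 = self.e**2 - self.px**2 - self.py**2 - self.pz**2
        return math.sqrt(max(m2, 0.0))

def from_pt_y_phi_m(pt, y, phi, m=0.0):
    """Build a cluster from transverse momentum, rapidity, azimuth and mass."""
    mt = math.sqrt(pt * pt + m * m)
    return Cluster(mt * math.cosh(y), pt * math.cos(phi),
                   pt * math.sin(phi), mt * math.sinh(y))

def delta_r(a, b):
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:                      # wrap the azimuthal difference
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.rapidity - b.rapidity, dphi)

def cambridge_aachen(clusters, r0=0.6, pt_min=20.0):
    """Merge the closest pair repeatedly while its separation is within r0."""
    objs = list(clusters)
    while len(objs) > 1:
        dr, i, j = min((delta_r(objs[i], objs[j]), i, j)
                       for i in range(len(objs))
                       for j in range(i + 1, len(objs)))
        if dr > r0:
            break
        a, b = objs[i], objs[j]
        merged = Cluster(a.e + b.e, a.px + b.px, a.py + b.py, a.pz + b.pz,
                         parents=(a, b))
        objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    hard = [o for o in objs if o.pt > pt_min]   # drop soft jets
    return sorted(hard, key=lambda o: o.pt, reverse=True)

# Toy input: two nearby massless clusters and one far away.
jets = cambridge_aachen([from_pt_y_phi_m(100.0, 0.0, 0.0),
                         from_pt_y_phi_m(80.0, 0.2, 0.2),
                         from_pt_y_phi_m(50.0, -1.0, 3.0)])
for jet in jets:
    print(f"pT = {jet.pt:6.1f} GeV, mass = {jet.mass:5.1f} GeV")
```

In the toy input the first two clusters merge into one jet with a non-zero invariant mass, while the third stays separate.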

The C/A algorithm described above^2 generically leads to a multi-jet final state depending on
the number and configuration of hadronic 'clusters'. Of course, this algorithm will indiscriminately
pick up QCD jets as well as those arising from heavy particle decay, so it should be
considered only as the initial step in our search for heavy particle jets. In order that the next
step be possible, the merging history must be stored for every hard 'fat' jet created by the
C/A algorithm. In the second stage of our analysis, this history is then used to (partially)
eliminate QCD jets by reversing the process of synthesis and applying the following criteria
at the different stages of the reversal:

• The jet - read four-momentum $(\vec{P}, M)$ - is broken into two sub-jets $(\vec{p}, m)$ and
$(\vec{p}\,', m')$ by reversing the last merging performed in the C/A synthesis, which means
that $\vec{P} = \vec{p} + \vec{p}\,'$. The ordering should be such that $M > m > m'$.

• We then check the 'mass drop', i.e. if $m' < xM$ (where $x$ is a smallish fraction), then it
is likely that the corresponding softer sub-jet was a radiated gluon. Following Ref. [36],
we set $x = 2/3$. There are two cases, viz.



^2 Many experimental analyses prefer to apply the so-called $k_T$ and anti-$k_T$ algorithms, where the C/A
distance function $\Delta R_{ij}$ is modified by a momentum-dependent factor. For the present analysis, the anti-$k_T$
algorithm would not work at all, and there is no practical advantage to be gained by the $k_T$ algorithm.
Hence, we confine ourselves to the simpler C/A algorithm.
1. If $0 < m' < xM$, then we eliminate the final step in the merging and zero in
on the harder sub-jet with $(\vec{p}, m)$ and start again, i.e. we consider the previous
step in which $(\vec{p}, m)$ was formed by the merging of two 'clusters', and repeat this
analysis. For all practical purposes, the softer sub-jet with $(\vec{p}\,', m')$ forms no part
of the remaining analysis.

2. If $xM < m' < M$, then we assume that we have found two sub-jets of comparable
momenta and further apply a symmetry criterion. This amounts to calculating
$\Delta R$ between the two sub-jets and constructing

$$ y = \frac{\min(p_T^2, p_T'^2)}{M^2}\, \Delta R^2 \simeq \frac{\min(p_T, p_T')}{\max(p_T, p_T')} \qquad (13) $$

The symmetry criterion is then set, following Ref. [36], as $y > 0.09$, which is consistent with
the value of $x$ chosen before. In practice, passing both these criteria simply means
that we are satisfied that the sub-jets arise from decay of a heavy particle (as
opposed to a gluon radiation/splitting) if they share the energy and momentum
of the composite 'fat' jet in a ratio not more skewed than 2:1.

• The final step in identifying a 'Higgs jet' would be to demand that the two reasonably
symmetric sub-jets picked out by the above algorithm are tagged as b-jets. For the
experimentalist, this would mean that these jets correspond to displaced vertices in
the tracker. However, in a theoretical analysis, we have the luxury of knowing the
exact components of each sub-jet, whose history can be tracked down to the parton
level even before fragmentation. Thus, we can achieve virtual b-tagging simply by
looking for a b-quark parentage for the sub-jets in question. This process, however, has
a 100% efficiency and a very small mistagging fraction, both of which are unrealistic.
We therefore need to import proper efficiency criteria from the existing experimental
analyses (see below).

The above criteria are fairly efficient in eliminating QCD jets, but are not so efficient in
eliminating jets arising from underlying events, which are again a new problem at the
enhanced energies and luminosities of the LHC. If these are not filtered out, the Higgs boson
mass reconstruction is adversely affected. We note that the b sub-jets will have an angular
separation given approximately by $R_{b\bar{b}} \gtrsim 2M_h/p_T$, which is smallish for boosted Higgs
bosons, but much larger than the angular resolution of the ATLAS and CMS detectors.
Although the cross section for $b\bar{b}$ pairs from underlying events will scale as $(R_{b\bar{b}})^4$ [55], it
turns out that for Higgs masses around 100 GeV and $p_T \sim 200 - 300$ GeV, $R_{b\bar{b}}$ is not so small
that underlying events can be neglected. Therefore, there is a third stage to our analysis,
where the Higgs jet candidates picked out by the previous algorithm are subjected to a
'filtering' process. This works as follows.

• Working back as before, we decompose the constituents of the candidate jet into the
original hadronic 'clusters'. Other jets and stable final states remain untouched.

• We define a new angular criterion $r_0$, and re-run the C/A algorithm to identify sub-jets
with this criterion instead of $R_0$. Again following Ref. [36], we set $r_0 = 0.2$. However,
as a check, we tried different values of this cone radius, viz. $r_0 = 0.1, 0.2, 0.3$ etc.
For Higgs bosons in the mass range of interest, the maximum efficiency in choosing
candidates for light Higgs bosons occurs for the choice $r_0 = 0.2$, and this is the value
for which we present our results.

• Of all the sub-jets thrown up by the above algorithm we choose the three hardest and
eliminate all the others. Of these three sub-jets, if the hardest two are b-tagged, we
identify the jet as a Higgs boson candidate. Note that we take three hard sub-jets in
order not to lose the not-so-rare $H \to b\bar{b}g$ events.

The above filtering process removes rare QCD cases where a very hard gluon is radiated, 
as well as hard emissions from underlying events, which could have satisfied the criteria in 
the first two stages, but will fail the last one. Once these three stages of identification are 
passed, we can be sure that a reasonable fraction of the tagged jets will originate from Higgs 
bosons. Our next step, therefore, is to see how efficient this Higgs jet-finding algorithm can 
be. 
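Two ingredients of the later stages are simple enough to sketch directly: the symmetry variable y of Eqn. (13), taken here in the form min(p_T^2, p_T'^2) Delta R^2 / M^2, and the final filtering decision that keeps the three hardest sub-jets and requires the two hardest to be b-tagged. The `Subjet` record and the function names are hypothetical, and the b-tag flag stands in for the parton-level matching described above.

```python
import math
from collections import namedtuple

# Hypothetical minimal record for a sub-jet.
Subjet = namedtuple("Subjet", "pt rapidity phi is_b")

def delta_r(a, b):
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:                      # wrap the azimuthal difference
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.rapidity - b.rapidity, dphi)

def symmetry_y(a, b, parent_mass):
    """Symmetry variable y built from the two sub-jets and the parent mass."""
    return min(a.pt, b.pt) ** 2 * delta_r(a, b) ** 2 / parent_mass ** 2

def filter_candidate(subjets, n_keep=3):
    """Keep the n_keep hardest sub-jets; accept if the two hardest are b-tagged."""
    kept = sorted(subjets, key=lambda s: s.pt, reverse=True)[:n_keep]
    is_higgs = len(kept) >= 2 and kept[0].is_b and kept[1].is_b
    return is_higgs, kept

# Toy fat jet: two b sub-jets, one hard gluon and one soft cluster.
b1 = Subjet(120.0, 0.0, 0.0, True)
b2 = Subjet(90.0, 0.4, 0.3, True)
gluon = Subjet(20.0, 0.2, -0.2, False)
soft = Subjet(5.0, -0.3, 0.4, False)

y = symmetry_y(b1, b2, parent_mass=52.0)    # 52 GeV ~ sqrt(pT1 * pT2) * dR
passes_symmetry = y > 0.09
is_higgs, kept = filter_candidate([soft, b2, gluon, b1])
print(f"y = {y:.3f}, symmetric: {passes_symmetry}, Higgs candidate: {is_higgs}")
```

For the toy jet y comes out close to min(p_T)/max(p_T) = 0.75, as expected when the parent mass is consistent with the two sub-jets.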

To make a specific study, we have made a Monte Carlo simulation of the following exclusive 
process: 

$$p + p \to \tilde{\chi}_2^0 + \tilde{\chi}_1^0, \qquad \tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_1^0 \qquad (14)$$

where the branching ratio for $\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_1^0$ is set to unity. The masses of the 
neutralinos $\tilde{\chi}_1^0$ and $\tilde{\chi}_2^0$ are set to 100 GeV and 700 GeV respectively in this toy model, 
and all other processes are switched off. The masses of the $\tilde{\chi}_1^0$ and $\tilde{\chi}_2^0$ are chosen so as to 
have sufficient phase space for the Higgs boson to be produced with a large $p_T$. It may be 
noted that it was not essential to have a supersymmetric origin of the Higgs bosons: a 
Higgstrahlung process $p + p \to Z^0 + h_1^0$, with subsequent decay $Z^0 \to \nu\bar{\nu}$, would have yielded 
the same final state, where we have only particles decaying to $b\bar{b}$ pairs in addition to other 
incidental debris from effects like initial state radiation, final state radiation, multiple 
interactions, and, of course, a great deal of missing energy and momentum. However, the 
processes in Eqn. (14) are preferred since they permit us to study a larger range in $p_T$ for 
the Higgs boson than the Higgstrahlung process would have allowed. In this simulation, the 
mass of the Higgs boson $h_1^0$ is varied over the range 20 - 150 GeV and the $p_T$ is noted. A 
total of 50 000 'events' was simulated for every value of $M(h_1^0)$. For every bin in $p_T(h_1^0)$, 
we apply the jet-finding algorithm described above in all its three stages and note the 
efficiency, i.e. the ratio of the number actually tagged as Higgs jets to the original number 
of $h_1^0$ produced. This enables us to produce the contour plots of Figure 1. 
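
The bin-wise efficiency defined above is simply a ratio of two histograms. A minimal sketch, assuming the produced and tagged Higgs bosons have already been counted in identical $p_T$ bins (the function name and the use of `None` for empty bins are illustrative choices, not taken from the analysis):

```python
from typing import List, Optional


def binwise_efficiency(n_produced: List[int],
                       n_tagged: List[int]) -> List[Optional[float]]:
    """Efficiency per p_T bin = tagged / produced; None where nothing was produced."""
    if len(n_produced) != len(n_tagged):
        raise ValueError("histograms must share the same binning")
    efficiencies: List[Optional[float]] = []
    for produced, tagged in zip(n_produced, n_tagged):
        efficiencies.append(tagged / produced if produced > 0 else None)
    return efficiencies
```

Repeating this for every simulated Higgs mass yields one column of the efficiency grid from which contours in the mass-$p_T$ plane can be drawn.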




Figure 1: Illustrating the efficiency of Higgs-tagging of 'fat' jets by the method described in this section. 
The upper (lower) row of panels corresponds to a single (double) b-tag among the sub-jets thrown up by 
the 'filtering' algorithm. The two panels on the left illustrate contours of the efficiency of this algorithm in 
the plane of $p_T(h_1^0)$ versus $M(h_1^0)$ assuming b-tagging is perfect. The panels in the middle show the effect, 
for $M(h_1^0) = 100$ GeV, of including the b-tagging efficiency with its $p_T$-dependence. The panels on the right 
show the results of convolution of the b-tagging efficiencies with those shown in the panels on the left, thereby 
producing a more-or-less realistic estimate of the actual efficiencies obtainable at the LHC. 

Some explanation of the plots in Figure 1 is required at this stage. Let us first consider the 
upper row of panels in Figure 1. We start with the upper panel on the extreme left, labelled 
'Single b-tag with $\eta_b = 1$'. This shows the result of implementation of the above algorithm 
with just one difference: after 'filtering', we pick up three hard jets and of the hardest two, 
only one is required to have a b-tag. The plot shows contours of efficiency for the Higgs-jet 
finding algorithm, with the b-tagging efficiency set to unity. It may be seen that for a given 
Higgs mass $M(h_1^0)$, the efficiency is small for small $p_T$, increases as $p_T$ increases, and then 
falls again as $p_T$ increases to very large values. This behaviour is exactly what one would 
expect as the boost parameter increases. When the boost is very small, the jets arising 
from Higgs boson decay tend to be widely separated, and hence jets found within a cone of 
radius $R_0 = 0.6$ will not show any substructure, as they essentially arise from the decay of a 
single b-quark. As the boost increases, more of the jets will now get collimated and we will 
begin to see the phenomenon of a fat jet with substructure, to which the finding algorithm 
is tuned. Not surprisingly, the efficiency will increase in this regime. However, when the 
boost is very large, the final states will form a very thin pencil and it may not be possible 
to resolve individual sub-jets using the criterion $r_0 = 0.2$. In this case, the efficiency will 
again fall. Increase in the Higgs mass $M(h_1^0)$ is equivalent to scaling down the $p_T$, and this 
is reflected in the contours shown in the figure. It augurs well for the success of this method 
that there is a substantial range of Higgs mass and Higgs $p_T$ where the efficiency can be as 
high as 70% or above, and that much of the $p_T(h_1^0)$-$M(h_1^0)$ plane corresponds to an efficiency 
of 50% or more. 
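
The rise and fall of the efficiency can be made semi-quantitative with the standard rule of thumb that the two decay products of a particle of mass $M$ and transverse momentum $p_T$ are separated by roughly $\Delta R \approx 2M/p_T$ for a symmetric decay. The sketch below uses this approximation, which is not part of the tagging algorithm itself, to estimate the $p_T$ window in which the pair fits inside the fat-jet radius 0.6 while remaining resolvable at the sub-jet radius 0.2. The function name is hypothetical.

```python
def pt_window(mass_gev: float, big_r: float = 0.6, small_r: float = 0.2):
    """Rough p_T range (GeV) in which small_r < DeltaR ~ 2M/p_T < big_r."""
    pt_min = 2.0 * mass_gev / big_r    # below this the pair is wider than the fat jet
    pt_max = 2.0 * mass_gev / small_r  # above this the pair merges within small_r
    return pt_min, pt_max
```

For a 50 GeV Higgs boson this gives a window of roughly 170 to 500 GeV, and doubling the mass doubles both edges, which is the scaling between mass and $p_T$ noted in the text.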

As mentioned before, it is unrealistic to take the b-tagging efficiency as $\eta_b = 1$. Following 
the experimental collaborations [56], we choose the $p_T$-dependent efficiency as 

$$\eta_b = \begin{cases} \cdots & \text{for } p_T < 20~\text{GeV} \\ 0.3 & \text{for } 20~\text{GeV} < p_T < 50~\text{GeV} \\ 0.6 & \text{for } 50~\text{GeV} < p_T < 400~\text{GeV} \\ 0.2 & \text{for } p_T > 400~\text{GeV} \end{cases}$$

This must be convoluted with the idealised efficiency shown in the upper left panel of 
Figure 1 to obtain a more realistic picture. The results of such a convolution are shown 
in the central upper panel, marked 'Single b-tag', for $M(h_1^0) = 100$ GeV. Here, it is easy to 
see how the efficiency is reduced quite dramatically by the requirement of b-tagging. What 
we gain in return, of course, is a much sharper drop in the background (not shown). The 
panel on the upper right of Figure 1, which is marked 'Single b-tag with $\eta_b \neq 1$', shows the 
contours of Higgs tagging efficiency after convolution with the b-tagging efficiency. This is a 
much more realistic estimate of the efficiency, where one notes that the poorer results at high 
$p_T$ arise because of the weakness of the b-tagging algorithm in that regime. There is still, 
however, a substantial range of Higgs mass and $p_T$ where the efficiency is 30% or more, and 
this is enough for the Higgs tagging algorithm to yield positive results, as we shall presently 
demonstrate. 
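
To illustrate how a $p_T$-dependent b-tagging efficiency feeds into single and double tags, the following sketch encodes the step function quoted above. Two points are assumptions rather than statements from the text: the efficiency below 20 GeV is exposed as a parameter defaulting to zero, and the tags on the two hardest sub-jets are treated as statistically independent.

```python
def eta_b(pt: float, below_20: float = 0.0) -> float:
    """Step-function b-tagging efficiency; `below_20` is an assumed value."""
    if pt < 20.0:
        return below_20
    if pt < 50.0:
        return 0.3
    if pt <= 400.0:   # the boundary at exactly 400 GeV is assigned here by choice
        return 0.6
    return 0.2


def single_tag_prob(pt1: float, pt2: float) -> float:
    """Probability that at least one of two sub-jets is tagged (independence assumed)."""
    return 1.0 - (1.0 - eta_b(pt1)) * (1.0 - eta_b(pt2))


def double_tag_prob(pt1: float, pt2: float) -> float:
    """Probability that both sub-jets are tagged (independence assumed)."""
    return eta_b(pt1) * eta_b(pt2)
```

Multiplying the idealised tagging efficiency in a given bin by one of these probabilities gives a rough version of the convolution; with both sub-jets between 50 and 400 GeV the single-tag factor is 0.84 and the double-tag factor is 0.36.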

Obviously, if we demand that both the harder sub-jets be b-tagged, we will get even lower 
efficiencies. The advantage, on the other hand, will be to have almost negligible mistagging 
probabilities. The lower three panels in Figure 1 illustrate the efficiency obtained by tagging 
both b-sub-jets. On the left, we plot contours of efficiency for a 'Double b-tag with $\eta_b = 1$', 
where efficiencies above 55% may be achieved in the optimum range of Higgs mass and 
transverse momentum. The central panel, marked 'Double b-tag', illustrates, like the panel 
above it, the effect of implementing realistic b-tagging on the efficiency contours for 
$M(h_1^0) = 100$ GeV. Not only does the overall efficiency fall below some 20% in the best case, 
but there is a sharp falling-off at high $p_T$, as expected. The realistic contours are shown in 
the panel on the lower right, marked 'Double b-tag with $\eta_b \neq 1$', and here we have a rather 
modest region even if we demand an efficiency of 10%, with only a small region revealing an 
efficiency of more than 15%. 

In view of the efficiency plots presented in Figure 1 and discussed above, we suggest that 
a single b-tag may be a more efficient way of identifying Higgs jets than the double b-tag. 
Though our analysis will take both the cases into account, it will be seen that the single 
b-tag is better for the earlier runs of the LHC, while the double b-tag will serve to clinch the 
discovery - if made - in the later higher luminosity runs of the same machine. 

A final comment is in order before we end this section and go on to our specific results. The 
analysis in this section, though showcased using the supersymmetric process in Equation (14), 
is really quite model-independent and the efficiency plots in Figure 1 are completely general, 
independent of the processes that produce the light Higgs boson $h_1^0$. Our analysis can, 
therefore, be extended not only to light Higgs bosons, but to any object of comparable mass 
decaying to a pair of jets. This includes the $W^\pm$ and $Z^0$ bosons, identifiable by their known 
masses, but also many exotics in some of the wide range of models which predict new physics. 

4 Bump Hunting in Boosted Jets : CPX Scenario 

Having set up the mechanism to identify boosted Higgs jets, we now apply this to the model 
of interest, viz., the CP-violating MSSM. There are several mechanisms by which the Higgs 
bosons $h_i^0$ ($i = 1, 2, 3$) can be produced in this model. We list some of the parton-level 
processes below. 

• Gluon fusion: $g + g \to h_i^0$, which is the major production process in the SM. Here, 
however, the $h_i^0$ produced will have very small $p_T$. 

• Vector boson fusion: $V + V \to h_i^0$, where $V = W^\pm, Z^0$, another SM process, where the 
$h_i^0$ are produced in association with two highly forward jets. 

• Higgstrahlung: $q + \bar{q}(\bar{q}') \to V^* \to V + h_i^0$, where $V = W^\pm, Z^0$, which is suppressed 
compared to the previous two, but produces $h_i^0$ with larger $p_T$. 

• Associated production with top quarks: $p + p \to t + \bar{t} + h_i^0$, which also has a small 
cross section, but produces $h_i^0$ with large $p_T$. 

• Associated production with gaugino states: $p + p \to \tilde{\chi}^\pm_{1,2} + \tilde{\chi}^\mp_{1,2} + h_i^0$, 
$\tilde{\chi}^0_{1,2,3,4} + \tilde{\chi}^0_{1,2,3,4} + h_i^0$, $\tilde{\chi}^\pm_{1,2} + \tilde{\chi}^0_{1,2,3,4} + h_i^0$, which occur only in SUSY models. 

• Chargino decay: $\tilde{\chi}_2^\pm \to \tilde{\chi}_1^\pm + h_i^0$, where the charginos are produced directly or in cascade 
decays of squarks/gluinos. 

• Neutralino decay: $\tilde{\chi}^0_{2,3,4} \to \tilde{\chi}_1^0 + h_i^0$, where the neutralinos are produced directly or in 
cascade decays of squarks/gluinos. 

• Stop and sbottom decay: $\tilde{t}_2 \to \tilde{t}_1 + h_i^0$ and $\tilde{b}_2 \to \tilde{b}_1 + h_i^0$, where the heavier stop or 
sbottom can be either produced directly or arise as a product of gluino decay. 

• Associated production with stop states: $p + p \to \tilde{t}_{1,2} + \tilde{t}^*_{1,2} + h_i^0$, which is similar to the 
associated production with top quarks. 

The above list is illustrative, but not exhaustive, for there can be several stages in SUSY 
cascade decays where the $h_i^0$ can be produced. All these processes are, however, taken care 
of in our numerical simulation using Pythia. 

Obviously, with so many contributing processes the LHC will produce large numbers of the 
$h_i^0$ as the run progresses. The question which interests us, however, is whether these produced 
$h_i^0$'s will be detectable using the tagging algorithm described in the previous section. For 
this, we have seen that the crucial kinematic determinant is the transverse momentum $p_T$. 
It is important, therefore, to have a clear picture of the $p_T$ distribution of the three Higgs 
scalars $h_i^0$. Unfortunately, this distribution is very sensitive to the particle masses and to 
the mass-splitting between the different chargino and neutralino states, i.e. to the exact 
choice of model parameters. Thus, it is very difficult to make very general statements about 
the nature of these $p_T$ distributions. Instead, we find it expedient to focus on the four 
benchmark points described in Table 1. Since these are chosen to cover various aspects of 
the CP-violating MSSM parameter space and guarantee the existence of light Higgs states, 
we can expect the $p_T$ distributions of the $h_i^0$ at these points to carry information about the 
strengths and weaknesses of the Higgs-tagging algorithm. 

Process                                                        BP-1    BP-2    BP-3    BP-4 

$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_1^0$                99.6    98.8    10.4    98.4 
$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_2^0$                   -       -    71.2       - 
$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_3^0$                   -       -    18.0       - 

$h_j^0 \to h_1^0 + h_1^0$ ($j = 2, 3$; one row per state)      84.1       -       -       - 
                                                               51.9       -       -       - 

$h_j^0 \to b + \bar{b}$ ($j = 2, 3$; same row order)           14.3    90.4    90.4    90.2 
                                                               43.7    90.7    90.7    90.5 
$h_1^0 \to b + \bar{b}$                                        92.0    91.5    91.4    91.2 

$\tilde{t}_2 \to \tilde{t}_1 + h_i^0$ (one row per state)      0.23    0.02    0.01    0.06 
                                                               29.2    31.2    26.9    14.7 
                                                               26.9    19.2    24.8    42.0 

Table 2: Some important branching ratios (per cent) for the four benchmark points of Table 1 in Section 2. 
Entries marked '-' indicate that the corresponding decay is kinematically disallowed. Note that the $h_i^0$ always 
decay predominantly into $b\bar{b}$ pairs. 

At the LHC, the production cross-section of the Higgs bosons depends not only on their 
masses, but also on their couplings and on the branching ratios of heavier particles to final 
states involving these Higgs particles. As there are many processes, this is not easy to predict 
or to explain. Some information about the principal production modes can be gleaned from 
a study of the branching ratios of gauginos to the Higgs states and the branching ratios of 
the latter to $b\bar{b}$ states. Some of these are listed in Table 2. From the table, it is obvious that 
the $h_1^0$ can arise from the decay of heavy gaugino as well as Higgs states, and that it always 
decays dominantly to $b\bar{b}$ pairs - as we have implicitly assumed in setting up the Higgs-tagging 
algorithm. The results shown in Table 2 also indicate that the benchmark point BP-3 will 
have qualitatively different features from the others, as we have already noted. 

In Figure 2 we generate the $p_T$ distributions for the three Higgs states. Black, red and 
blue histograms correspond, respectively, to the $h_1^0$, $h_2^0$ and $h_3^0$ states for each of the four 
benchmark points on the corresponding panel in Figure 2, as marked. Since these plots 
are generated for a theoretical understanding of the Higgs-tagging algorithm, none of the 
standard kinematic cuts used in SUSY searches have been applied. These plots immediately 
tell us that for BP-1, BP-2 and BP-4, the lightest $h_1^0$ is the most produced of the three states, 
whereas for BP-3, it is the heavier $h_2^0$ which is produced most copiously. 

Figure 2: Distributions in transverse momentum $p_T(h_i^0)$ for the three Higgs bosons ($i = 1, 2, 3$) at the 
14-TeV LHC. Black, red and blue histograms correspond, respectively, to the $h_1^0$, $h_2^0$ and $h_3^0$ states (as 
marked). The four panels in this plot are marked BP-1, BP-2, BP-3 and BP-4, corresponding to the parameter 
choices of Table 1. No kinematic cuts were applied to generate these distributions. 

We are now in a position to correlate the $p_T$ distributions of Figure 2 with the efficiency 
plots of Figure 1. Let us first consider the panel marked BP-1. In this case, the masses of 
the Higgs states are approximately 50, 100 and 130 GeV respectively. If we consider the 
panel marked 'Single b-tag with $\eta_b \neq 1$' in Figure 1, we note that for the lightest state, we 
can obtain a tagging efficiency above 30% for the $p_T$ range 200 - 400 GeV. This is not the 
peak region of the $p_T$ distribution shown in Figure 2, but just to the right of it. It follows 
that the tagging algorithm will miss a fairly large fraction of the Higgs bosons, though it 
will still be able to capture a significant number. We shall see presently that this is enough 
to identify the Higgs state. For the moment, we focus on the other states, where we have to 
go to $p_T > 400$ GeV to obtain any useful efficiency. However, for such large values of $p_T$, the 
production cross section for these states is at least an order of magnitude smaller than that 
of the $h_1^0$ in the region where it can be tagged with reasonable efficiency. Thus, we should not 
expect any useful signal from the production of these states, i.e. they will remain invisible 
to our study. For such states, we will have to turn to the rare $\gamma\gamma$ decay mode, which is a 
much more difficult study and beyond the scope of this work. 
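
The qualitative argument above amounts to folding a $p_T$ histogram with a $p_T$-dependent tagging efficiency. A minimal sketch of that folding follows; the function name is hypothetical, and any numbers passed to it would be placeholders rather than values read off the figures.

```python
from typing import Sequence


def expected_tagged(counts: Sequence[float], efficiencies: Sequence[float]) -> float:
    """Sum over p_T bins of (number produced in the bin) x (tagging efficiency)."""
    if len(counts) != len(efficiencies):
        raise ValueError("histogram and efficiency must share the same binning")
    return sum(n * eff for n, eff in zip(counts, efficiencies))
```

A state whose production peaks where the efficiency is low can still yield a useful tagged sample if enough of its distribution extends into the efficient region, which is the situation described for the lightest state.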

For the other benchmark points, we can carry out a similar analysis based on a combination 
of Figures 1 and 2. Despite the variation in parameters, the conclusions for BP-2 and BP-4 
are very similar to those for BP-1, and hence are not detailed here. For BP-3, however, the 
situation is somewhat different. As in the case of BP-1, reasonable efficiencies for the heavy 
$h_2^0$ state (~ 113 GeV) take us into the high-$p_T$ regime, where we pick up only the tail of 
the $p_T$ distribution, whereas the lightest $h_1^0$ state (~ 72 GeV) can be tagged with better 
efficiency in the range where it is produced more copiously. However, the overall production 
of the $h_2^0$ is so much larger than that of the $h_1^0$ that we should expect the final cross section 
for the $h_2^0$ to be at least comparable with that of the $h_1^0$. We should not expect much from 
the $h_3^0$, which is neither light enough to avail of the higher efficiencies, nor is produced in 
enough numbers to offset that disadvantage. 

If we now pass from the 'Single b-tag with $\eta_b \neq 1$' to the 'Double b-tag with $\eta_b \neq 1$', we shall 
obtain very similar results, except that the efficiencies will drop by a factor of 3 or more. 
Hence discovery through this mode will require the collection of more data than the previous 
case. However, as mentioned before, this should be treated as secondary, clinching evidence 
once we have a definite signal in the 'Single b-tag with $\eta_b \neq 1$' data sample. 

The results of Figures 1 and 2 and the arguments presented above make a reasonable case 
for using the tagging algorithm of Section 3 as a tool in a search for light Higgs bosons 
in the CP-violating MSSM. However, such a study now needs to be carried out with a 
full Pythia simulation including all relevant processes and with jet identification through 
the FastJet package, as explained in Section 3 for a toy model. We have carried out this 
study for the four benchmark points as above, after generating the SUSY mass spectrum and 
couplings using CPsuperH as incorporated in the CalcHEP package [57]. Parton density 
functions of the CTEQ5L set [58] have been used and the factorisation scale is set to the 
parton-level energy. The final states of interest either have multijets and large missing $p_T$ 
($\not{p}_T$) or one hard lepton, multijets and large $\not{p}_T$. After generating each event, we have 
applied the following kinematic cuts, which are more-or-less in line with those applied in 
standard SUSY searches by the CMS Collaboration [52]. These are listed below. 

1. We select events with jets ($J$) having transverse momentum $p_T^J > 30$ GeV and 
pseudorapidity $|\eta_J| < 3$. The identified jets will be labelled $J_1, J_2, \ldots$ in order of 
decreasing $p_T$. 

2. The events must have missing transverse momentum satisfying $\not{p}_T > 300$ GeV. To 
calculate this, we take into account all jets with $p_T^J > 20$ GeV and pseudorapidity 
$|\eta_J| < 4.5$ and all leptons with $p_T^\ell > 10$ GeV and pseudorapidity $|\eta_\ell| < 2.5$. (Here we 
deviate a little from the CMS analysis, which has $\not{p}_T > 200$ GeV.) 

3. The leading (maximum $p_T$) jet must have pseudorapidity $|\eta_{J_1}| < 1.7$. 

4. The two leading jets must have $p_T^{J_1} > 180$ GeV and $p_T^{J_2} > 110$ GeV. 

5. We calculate an effective mass $M_{\rm eff}$ by summing the transverse momenta of the leading 
jets and the missing transverse momentum $\not{p}_T$, and impose the condition 
$M_{\rm eff} > 500$ GeV. Naturally, the $p_T$ of a jet is added only if the corresponding jet exists. 

6. The angular separations between the two leading jets and the $\not{p}_T$ are calculated as 
$R_1 = \sqrt{(\pi - \delta\varphi_1)^2 + \delta\varphi_2^2}$ and $R_2 = \sqrt{\delta\varphi_1^2 + (\pi - \delta\varphi_2)^2}$, where $\delta\varphi_1 = |\varphi_{J_1} - \varphi_{\not{p}_T}|$ and 
$\delta\varphi_2 = |\varphi_{J_2} - \varphi_{\not{p}_T}|$. On these, we impose the conditions that $R_{1,2} > 0.5$. 

7. The azimuthal angular separation between all jets and the $\not{p}_T$ must satisfy the criterion 
$\delta\varphi_i = |\varphi_{J_i} - \varphi_{\not{p}_T}| > 0.3$. 

8. The azimuthal angular separation between the next-to-leading jet and the $\not{p}_T$ must 
satisfy the criterion $\delta\varphi_2 = |\varphi_{J_2} - \varphi_{\not{p}_T}| > 0.35$. 

9. We impose a lepton veto, i.e. the event should not contain any isolated lepton with 
$p_T^\ell > 20$ GeV and $|\eta_\ell| < 2.5$. The isolation criteria imposed on each lepton are (a) that 
the angular separation $\Delta R_{\ell j}$ between the lepton and every jet should not be less than 
0.4 and (b) that the sum of the scalar $p_T$ of all stable visible particles within a cone of 
radius $\Delta R = 0.2$ around the lepton should not exceed 10 GeV. 
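
The purely angular cuts (6 to 8) in the list above can be sketched as follows. Several points are assumptions: the azimuthal differences are folded into the range $[0, \pi]$, the quantities $R_1$ and $R_2$ are taken in the form commonly used in CMS-style jets-plus-missing-momentum analyses, $R_1 = \sqrt{(\pi - \delta\varphi_1)^2 + \delta\varphi_2^2}$ and $R_2 = \sqrt{\delta\varphi_1^2 + (\pi - \delta\varphi_2)^2}$, and the function names are hypothetical.

```python
import math
from typing import Sequence


def delta_phi(phi_a: float, phi_b: float) -> float:
    """Absolute azimuthal separation folded into [0, pi]."""
    d = abs(phi_a - phi_b) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d


def passes_angular_cuts(jet_phis: Sequence[float], met_phi: float) -> bool:
    """Cuts 6-8 for jets ordered by decreasing p_T (at least two jets needed)."""
    if len(jet_phis) < 2:
        return False
    dphi = [delta_phi(phi, met_phi) for phi in jet_phis]
    r1 = math.hypot(math.pi - dphi[0], dphi[1])
    r2 = math.hypot(dphi[0], math.pi - dphi[1])
    return (r1 > 0.5 and r2 > 0.5            # cut 6
            and all(d > 0.3 for d in dphi)   # cut 7
            and dphi[1] > 0.35)              # cut 8
```

Cuts of this kind reject events in which the missing momentum is aligned with, or exactly opposite to, one of the leading jets, as tends to happen when it comes from jet mismeasurement.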





              BP-1      BP-2      BP-3      BP-4      $t\bar{t}$ 

Sample      114000    111090    108720    109170    11070000 
Cut 1       103880    101968     80624     99910     7208031 
Cut 2        11161     11539     13778     11012       19360 
Cut 3        10489     10861     12726     10410       17156 
Cut 4         9759     10041     10942      9695       12165 
Cut 5         9745     10025     10896      9684       11997 
Cut 6         8237      8517      9451      8225        7372 
Cut 7         5220      5452      6335      5211        4697 
Cut 8         5199      5431      6317      5193        4679 
Cut 9         3436      3814      5089      3501        1880 

Table 3: Kinematic cut flow table for a luminosity of 30 fb$^{-1}$ at each of the benchmark points BP-1 to 
BP-4. The cuts are numbered as in the text. Note the drastic effect of Cut 2, which requires $\not{p}_T > 300$ GeV. 
The efficacy of these cuts in removing the enormous $t\bar{t}$ background is immediately obvious. 

Figure 3: Bin-wise invariant mass distribution of the leading jet with single (double) b-tags in 10 (30) fb$^{-1}$ 
of data at the 14 TeV LHC. The shaded histogram represents the $t\bar{t}$ background. Note the clear resonances 
corresponding to the lightest Higgs boson $h_1^0$ for the benchmark points BP-1, BP-2 and BP-4 (marked on 
the respective panels). For BP-3 and a single b-tag, we find only a modest Higgs resonance, while, for a 
double b-tag, there is only a tiny bump. 

It is interesting to see how the raw signal plotted in Figure 2 is affected by the different 
cuts enumerated above. The total SUSY cross sections are rather large, being at the level 
of 3.8 pb, 3.7 pb, 3.6 pb and 3.6 pb for the benchmark points BP-1, BP-2, BP-3 and BP-4 
respectively, but are dwarfed by the $t\bar{t}$ cross section, which is 369 pb at the lowest order. The 
effects of the different cuts on the cross section are displayed in Table 3, where we assume 
an integrated luminosity of 30 fb$^{-1}$. The numbering 1-9 of the cuts is exactly as in the text 
above. The crucial role of Cut 2 (on $\not{p}_T$) is obvious, and may be taken as a justification 
of the choice of a stronger cut than that of the CMS collaboration. It is also interesting to 
note that it is the lepton veto which finally reduces the $t\bar{t}$ background to a tractable value 
without affecting the signal events quite as strongly. 
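
As a cross-check of the cut flow, the sample sizes follow from cross section times luminosity, and the survival fractions follow from the rows of Table 3. The sketch below uses the BP-1 and top-pair columns of that table and the cross sections quoted in the text; the function and variable names are illustrative.

```python
LUMI_FB = 30.0  # integrated luminosity assumed in Table 3, in fb^-1


def expected_events(sigma_pb: float, lumi_fb: float = LUMI_FB) -> float:
    """Number of events = cross section (converted from pb to fb) x luminosity."""
    return sigma_pb * 1000.0 * lumi_fb


# Event counts after each stage: raw sample, then Cuts 1 to 9 (from Table 3).
CUT_FLOW = {
    "BP-1": [114000, 103880, 11161, 10489, 9759, 9745, 8237, 5220, 5199, 3436],
    "ttbar": [11070000, 7208031, 19360, 17156, 12165, 11997, 7372, 4697, 4679, 1880],
}


def cumulative_efficiency(counts):
    """Fraction of the raw sample surviving after each successive cut."""
    return [c / counts[0] for c in counts]
```

With these numbers about 3% of the BP-1 sample survives all nine cuts, against roughly 0.02% of the top-pair sample.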

The results of our numerical study are presented in Figure 3, where we have plotted 
distributions in the invariant mass of the leading jet (identified as a Higgs jet by the algorithm 
of Section 3). In every panel, the unshaded histograms represent the signal and the shaded 
histograms represent the $t\bar{t}$ background, which is the dominant SM background. Each row of 
panels corresponds to one of the benchmark points of Table 1, as marked on the corresponding 
panels. The left (right) column corresponds to studies with a single (double) b-tag. To 
compensate for the lower efficiencies in the latter case, we use a higher luminosity of 30 fb$^{-1}$ 
instead of the 10 fb$^{-1}$ used for the former. 

The results for BP-1, BP-2 and BP-4 are very similar: in each case with a single b-tag we 
get a clear peak corresponding to the $h_1^0$ resonance. Taking the $t\bar{t}$ background alone, and 
the single bin of interest, this would be a deviation from the SM prediction at the level of 
around $10\sigma$ in each case, which means that a discovery can be claimed in the very early 
stages of the 14 TeV LHC run. A similar statement can be made for the double b-tags with 
our higher luminosity assumption. For the benchmark point BP-3, with a single b-tag, a 
large and broad excess region can be found, with a small Higgs peak. On 
the other hand, for a double b-tag, the same benchmark point reveals rather disappointing 
results, with the signal hardly distinguishable from fluctuations in the $t\bar{t}$ histogram. 

On the whole, therefore, we can conclude that prospects for detecting the lightest Higgs 
boson of the CP-violating MSSM by tagging 'fat' jets are rather bright. The single b-tag 
alone gives very clear resonant peaks in the invariant mass distribution of the 'fat' jets, and, 
if the parameters are favourable, one can expect confirming signals using the double b-tag 
method as well. It is only fair, however, to mention that our estimation of the background 
is somewhat crude and is limited to the $t\bar{t}$ signal at the leading order (LO). We know, of 
course, that addition of next-to-leading order (NLO) corrections to the $t\bar{t}$ cross section can 
increase the cross section by a factor of 2 or more. In that case, the background indicated 
by the shaded histogram in the figure will grow by a corresponding factor and the resonant 
peaks of the signal may not stand so tall above background as they appear in Figure 3. 
However, this is no reason to be disheartened, for (a) the Higgs boson production modes will 
also receive comparable enhancements when NLO corrections are added, and (b) what will 
be observed will be the sum of signal and background events, and if this is compared with 
the random fluctuation in the background, the single b-tag signal with 10 fb$^{-1}$ of data will 
still be at a level of more than $5\sigma$, which is all that we require for a discovery. 
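
The robustness argument can be illustrated with the simplest significance estimate, the excess over background divided by the Poisson fluctuation of the background. The event counts used in the usage note below are hypothetical and are chosen only to reproduce a deviation of about ten standard deviations; the scaling factors stand in for possible NLO enhancements.

```python
import math


def significance(signal: float, background: float) -> float:
    """Excess over background in units of the background fluctuation, S / sqrt(B)."""
    if background <= 0.0:
        raise ValueError("background must be positive")
    return signal / math.sqrt(background)


def scaled_significance(signal: float, background: float,
                        k_signal: float = 1.0, k_background: float = 1.0) -> float:
    """Significance after rescaling signal and background by K-factors."""
    return significance(k_signal * signal, k_background * background)
```

For a hypothetical bin with 100 signal and 100 background events the estimate is 10; doubling only the background lowers it to about 7.1, still above 5, while doubling both raises it to about 14.1.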

Before we end this section it is appropriate to comment on other backgrounds, apart from 
those due to $t\bar{t}$ production. The standard cuts devised for SUSY searches and applied to 
this signal will effectively remove backgrounds arising from the production of electroweak 
bosons, as may be guessed from the complete absence of peaks in the vicinity of $M_W$ and 
$M_Z$ in Figure 3 - unless, indeed, a light Higgs boson happens to lie just there. However, 
there will be a pure QCD background, which, despite drastic reduction by the same cuts 
and a very small mistagging probability, may still be expected to contribute something to 
the background because the initial cross section is many orders of magnitude larger than the 
signal. In this work, we have not attempted any detailed estimate of the QCD background or 
the mistagging probability, but a rough estimate based on the results of Ref. [59] shows that 
the QCD background will not be more than a few femtobarns, whereas our signal, even after 
application of all cuts, is around 100 fb. Another possibility is that of backgrounds arising 
from the production and decay of supersymmetric particles, which we have not estimated, 
but which surely form some fraction of what we have identified as the signal. Once again, we 
have not attempted a detailed study of these, but we can guess from the fairly tall and sharp 
resonances that we seem to be predicting for the light Higgs states, that these backgrounds 
will not really prove a difficulty when the experimental data are available. 

5 Beyond the CPX Scenario 

We have seen in the previous section that one can use jet substructure analyses very 
effectively to probe hitherto-invisible Higgs bosons in the CPX scenario, as exemplified through 
our choice of four benchmark points. However, one can now ask whether a similar analysis 
can be used in a more general context than the CPX scenario, which itself is a benchmark 
created to showcase a 'blind spot' in previous collider searches for the CP-violating MSSM. 
A detailed answer to this question would require a detailed scan of the parameter space to 
determine the exact extent of this collider-invisible region - a lengthy and tedious business, 
given the large number of parameters which come into play when we allow the MSSM to 
become CP-violating. However, a partial answer, at least, can be sought by again picking 
up a couple of benchmark points, BP-5 and BP-6, which do not conform to the CPX 
parametrisation, but are nevertheless part of the 'blind spot', i.e. invisible to collider searches 
using conventional strategies. 



BP   $M_1$   $M_2$   $\tan\beta$   ?      ?      ?          $M(h_1^0)$   $M(h_2^0)$   $M(h_3^0)$   $M(\tilde{\chi}_1^0)$   $M(\tilde{\chi}_1^\pm)$ 

5    100     200     8             130    2400   $\pi/2$    37.2         110.4        132.5        99.8                    199.1 
6    150     400     11            135    2000   $3\pi/4$   56.8         117.3        127.7        149.8                   398.7 

Table 4: Benchmark points lying outside the CPX scenario, but within the 'blind spot' for collider searches. 
Free parameters not explicitly given in the table are taken identical with those in the CPX scenario of the 
previous section. All masses are in units of GeV. 



As before, we exhibit our choice of benchmark points in Table 4 where, in addition to the 
free parameters, we display part of the mass spectrum. The choice of BP-5 is dictated by a 
desire to make the lightest Higgs boson $h_1^0$ as light as possible - and here it can be as low as 
about 37 GeV. The mass spectrum for BP-6 is rather similar to that of BP-3. The crucial 
difference here lies in the branching ratios exhibited in Table 5, where we note that the $\tilde{\chi}_2^0$ 
can decay with a reasonable branching ratio into the $\tilde{\chi}_1^0$ plus any one of the three Higgs 
states $h_1^0$, $h_2^0$ and $h_3^0$. In this case, we could perhaps have comparable numbers of all the 
three Higgs states produced and look for all of them together using our tagging algorithm. 



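Process                                                        BP-5    BP-6 

$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_1^0$                99.7    61.9 
$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_2^0$                   -    23.8 
$\tilde{\chi}_2^0 \to \tilde{\chi}_1^0 + h_3^0$                   -    13.9 

$h_j^0 \to h_1^0 + h_1^0$ ($j = 2, 3$; one row per state)      72.5    21.2 
                                                               88.4    30.5 

$h_j^0 \to b + \bar{b}$ ($j = 2, 3$; same row order)           23.5    71.3 
                                                               10.5    63.0 
$h_1^0 \to b + \bar{b}$                                        92.4    91.7 

$\tilde{t}_2 \to \tilde{t}_1 + h_i^0$ (one row per state)      0.09    0.06 
                                                               20.2    17.5 
                                                               36.8    34.8 

Table 5: Some important branching ratios (per cent) for the two benchmark points of Table 4. Entries 
marked '-' indicate that the corresponding decay is kinematically disallowed. 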

Having made this choice, we once again start by investigating the $p_T$ distribution of the Higgs 
states to get a crude impression of how effective the Higgs tagging algorithm can be. Our 
results are presented in Figure 4, which resembles Figure 2 closely, except that the panels 
correspond to BP-5 and BP-6 rather than BP-1 to BP-4. The histograms in black, red and 
blue correspond to the $h_1^0$, $h_2^0$ and $h_3^0$ states respectively. The left (right) panel corresponds 
to BP-5 (BP-6) as marked. The qualitative features of these two plots are very different: 
for BP-5, production of $h_1^0$ bosons is overwhelmingly dominant over production of $h_2^0$ and 
$h_3^0$, whereas the numbers are more comparable for BP-6. The reason for this lies less in the 
kinematics than in the dominant branching ratio of $\tilde{\chi}_2^0$ decays to the $h_1^0$ in the BP-5 case, as 
exhibited in Table 5. 

Once again, it is interesting to compare Figure 4 with the efficiency plots in Figure 1. 
Concentrating on BP-5 for the moment, let us note that for a $h_1^0$ mass as low as 37 GeV, 
reasonable Higgs-tagging efficiencies can be obtained in the $p_T$ range 150 - 400 GeV, which 
allows us to include a little more of the peak region of the $p_T$ histogram than was the case for, 
say, BP-1. Thus we may expect a somewhat taller resonance for the $h_1^0$ in this case than was 
found for BP-1 and similar cases. In this case, the number of $h_2^0$ and $h_3^0$ produced is too small 
to expect any signals over the substantial $t\bar{t}$ background. If we carry out a similar analysis 
for the case of BP-6, we will find that all three Higgs bosons have comparable chances of 
producing a resonance in tagged 'fat' jets, though even here the $h_1^0$ will have a slight edge 
over the others due to its lighter mass and greater tagging efficiency. 

Figure 4: Distributions in transverse momentum $p_T(h_i^0)$ for the three Higgs bosons ($i = 1, 2, 3$) at the 
14-TeV LHC. Black, red and blue histograms correspond, respectively, to the $h_1^0$, $h_2^0$ and $h_3^0$ states. The 
panels in this plot are marked BP-5 and BP-6, corresponding to the parameter choices of Table 4. No 
kinematic cuts were applied to generate these distributions. 

The above arguments enable us to make some crude guesses about the possible results, but 
it is necessary to carry out the full Monte Carlo simulation as in the previous section before 
we can say anything more definitive. In order to do this, we use the same machinery as 
described in the previous section, with the identical set of cuts and efficiency assumptions. 
The results of our numerical simulations are displayed in Figure 5 where, as in Figure 3, we 
plot distributions in the invariant mass of the leading jet (identified as a Higgs jet by the 
algorithm of Section 3). The upper row of panels corresponds to the benchmark point BP-5 
of Table 4, as marked on the panels, while the lower row corresponds to the benchmark point 
BP-6. As before, the left column corresponds to studies with a single b-tag with 10 fb$^{-1}$, 
while the right column presents the results with a double b-tag and 30 fb$^{-1}$ of data. As in 
Figure 3, the unshaded histograms represent the signal while the shaded histograms represent 
the $t\bar{t}$ background. 

Figure 5: Bin-wise invariant mass distribution of the leading jet with single (double) b-tags in 10 (30) fb$^{-1}$ 
of data at the 14-TeV LHC for the benchmark points BP-5 and BP-6 (marked on the respective panels). 
Note the clear $h_1^0$ resonance for BP-5 with both single and double b-tags. For BP-6 we find small resonances 
corresponding to the $h_1^0$ and a second Higgs state. The shaded histogram represents the $t\bar{t}$ background. 

These results are more or less as expected from the previous discussion, where we took the $p_T$ 
distribution and the efficiency plot of Figure 1 jointly into consideration. For the benchmark 
point BP-5, with a 37 GeV Higgs state, we get tall resonances in the 'fat' jet invariant mass. 
Here the single b-tag method would produce a signal even for 1 fb$^{-1}$ of data. Of all the 
benchmark points considered, this is definitely the best signal we get, and thus, if we are 
lucky, even the first few months of running of the LHC at 14 TeV could yield a light 
Higgs discovery. On the other hand, if we consider the other benchmark point BP-6, the 
signal is much weaker, but it seems just possible to detect the $h_1^0$ as well as a second Higgs state. 
The evidence for the latter, as seen in Figure 5, is not very strong, but would certainly 
improve as more data are collected. Discovery of a second Higgs boson would go a long way 
towards establishing a CP-violating Higgs sector, and more generally, a BSM Higgs sector. 
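The remark that weak evidence 'would certainly improve as more data are collected' can be made concrete with the standard counting-experiment estimate, in which both signal and background counts grow linearly with integrated luminosity, so that the naive significance $S/\sqrt{B}$ grows as the square root of the luminosity. The following is only a minimal sketch of that scaling: the function name and the event rates are hypothetical placeholders chosen for illustration, and are not taken from our simulation.

```python
import math


def naive_significance(signal_per_fb, background_per_fb, luminosity_fb):
    """Naive S/sqrt(B) in a fixed mass window for a given integrated luminosity.

    Rates are in events per fb^-1; luminosity is in fb^-1.
    Systematic uncertainties are ignored in this sketch.
    """
    s = signal_per_fb * luminosity_fb
    b = background_per_fb * luminosity_fb
    return s / math.sqrt(b)


# Placeholder rates (events per fb^-1), NOT results from the paper.
SIGNAL_RATE = 5.0
BACKGROUND_RATE = 100.0

for lumi in (10.0, 30.0, 100.0):
    z = naive_significance(SIGNAL_RATE, BACKGROUND_RATE, lumi)
    print(f"L = {lumi:6.1f} fb^-1  ->  S/sqrt(B) = {z:.2f}")
```

Under these assumptions, quadrupling the data set doubles the naive significance; a realistic estimate would also have to include systematic uncertainties on the background shape.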



6 Summary and Outlook 

In this article, we have presented arguments to show that the existence of light Higgs scalars 
with masses below 100 GeV is by no means ruled out by existing data, except within the 
very restrictive assumptions made in the SM and in the MSSM without CP violation. As 
an example of a model where relaxation of these assumptions permits the existence of light 
Higgs bosons, we have chosen the CP-violating version of the MSSM, and argued that this is 
by no means an exotic model, but a very natural version of a supersymmetric model. In this 
model, we have picked a set of parameters known as the CPX scenario, which is known to 
lie in a 'blind spot' of all collider searches to date, including LEP-1 and LEP-2, 
the Tevatron and the just-concluded runs of the LHC. To keep the discussion focussed, we 
have chosen four benchmark points in the CP-violating MSSM, all of which lie within the 
CPX scenario, and verified that these all correspond to light Higgs bosons which would be 
completely invisible to all the above-mentioned collider searches. Of course, except for our 
specific choice of the benchmark points, all of this is simply a reiteration of facts or results 
available in the literature. 

The novel feature of our work is the suggestion that the 'blind spot' of previous collider 
searches may actually be explored at the LHC using recently developed Higgs-tagging 
techniques based on jet substructure. Once again, such techniques have been described in the 
literature earlier, and even applied to study Higgs bosons in the CP-conserving MSSM, though 
with somewhat modest results [10]. In our work, we simply follow the original algorithm for 
Higgs tagging, with some small modifications for the case of interest, as described in Section 3. 
The results of our study are showcased in Figure 1, where the efficiency contours in the plane 
of Higgs $p_T$ and Higgs mass clearly show that the method is much more efficient for light 
Higgs bosons than for Higgs bosons in the range permitted by the SM and CP-conserving 
MSSM. When we apply it to the benchmark points chosen within the CPX scenario, we find 
that clearly identifiable resonances corresponding to the lightest Higgs boson appear in the 
invariant mass distribution of all jets which pass through the Higgs tagging algorithm. For 
some points, one can even discern hints of a second Higgs boson. These encouraging results 
are found whether we tag one or both of the b sub-jets resulting from the Higgs 
decays. 
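One way to see why the tagging is more efficient for light Higgs bosons is the familiar rule of thumb that the two b quarks from a boosted Higgs boson of mass $m_h$ and transverse momentum $p_T$ are separated by roughly $\Delta R \approx 2 m_h / p_T$. The sketch below simply evaluates this approximation, which holds only for $p_T \gg m_h$ and a roughly symmetric decay. The function names and the fat-jet radius of 1.2 are assumptions made for illustration and are not parameters quoted in this section.

```python
def bb_separation(mass_gev, pt_gev):
    """Approximate angular separation Delta R ~ 2 m / pT of the two b quarks."""
    return 2.0 * mass_gev / pt_gev


def min_pt_for_fat_jet(mass_gev, jet_radius):
    """Rough pT above which both b quarks fall inside one jet of the given radius."""
    return 2.0 * mass_gev / jet_radius


# Assumed fat-jet radius; an illustrative choice, not a value from the text.
FAT_JET_RADIUS = 1.2

# 37 GeV is the light state of BP-5; 125 GeV is shown for comparison.
for mass in (37.0, 125.0):
    pt_min = min_pt_for_fat_jet(mass, FAT_JET_RADIUS)
    print(f"m_h = {mass:5.1f} GeV  ->  pT >~ {pt_min:.0f} GeV")
```

With this assumed radius the threshold scales linearly with the mass, so a 37 GeV state merges into a single jet at a considerably lower $p_T$ than a 125 GeV state, and a larger fraction of the produced bosons is therefore available for tagging.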

Emboldened by our findings, we have extended our study beyond the CPX scenario to other 
parts of the 'blind' region, where we have selected another two benchmark points, with 
distinct features in the mass spectra and branching ratios. Here again, we have found very 
clear resonances, indicating that the same technique can be a powerful probe of a large part, 
if not all, of the hitherto 'blind spot' of collider searches in the CP-violating MSSM, over 
and above the CPX region. 

The purpose of this work was to establish that the Higgs-tagging algorithm can be used 
effectively to probe the 'blind spot' of the CP-violating MSSM, and we believe that this has 
been adequately established by the positive results obtained at all six benchmark points 
chosen. A more detailed mapping of the 'blind spot' is in progress [17]. In fact, we have 
already remarked that this method can also probe other models with light Higgs bosons, for 
the Higgs-tagging algorithm works so long as these Higgs bosons are boosted and decay to 
$b\bar{b}$ pairs. 

Eventually, we hope that this powerful technique will be taken up by the experimental 
collaborations and applied to real data, instead of Monte Carlo simulations as was done in 
this exploratory study. The most exciting possibility, of course, would be if experimental 
searches could use this technique to actually find a light Higgs resonance, which has been 
missed in all collider searches so far. If not, we will have to settle, as usual, for a more 
constrained parameter space for the model in question. In either case, the present technique 
could be the key to accessing the region which previous searches could not, and that alone 
represents a modest degree of progress. 